
Optimal Sketching for Trace Estimation

Neural Information Processing Systems

Matrix trace estimation is ubiquitous in machine learning applications and has traditionally relied on Hutchinson's method, which requires $O(\log(1/\delta)/\epsilon^2)$ matrix-vector product queries to achieve a $(1 \pm \epsilon)$-multiplicative approximation to $\text{trace}(A)$ with failure probability $\delta$ on positive-semidefinite input matrices $A$. Recently, the Hutch++ algorithm was proposed, which reduces the number of matrix-vector queries from $O(1/\epsilon^2)$ to the optimal $O(1/\epsilon)$, and the algorithm succeeds with constant probability. However, in the high probability setting, the non-adaptive Hutch++ algorithm suffers an extra $O(\sqrt{\log(1/\delta)})$ multiplicative factor in its query complexity. Non-adaptive methods are important, as they correspond to sketching algorithms, which are mergeable, highly parallelizable, and provide low-memory streaming algorithms as well as low-communication distributed protocols. In this work, we close the gap between non-adaptive and adaptive algorithms, showing that even non-adaptive algorithms can achieve $O(\sqrt{\log(1/\delta)}/\epsilon + \log(1/\delta))$ matrix-vector products. In addition, we prove matching lower bounds demonstrating that, up to a $\log \log(1/\delta)$ factor, no further improvement in the dependence on $\delta$ or $\epsilon$ is possible by any non-adaptive algorithm. Finally, our experiments demonstrate the superior performance of our sketch over the adaptive Hutch++ algorithm, which is less parallelizable, as well as over the non-adaptive Hutchinson's method.
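The two estimators contrasted in the abstract can be sketched in a few lines of NumPy. Below is plain Hutchinson's method together with a textbook (adaptive) Hutch++ variant: project out an approximate top eigenspace found from a random range sketch, take its trace exactly, and run Hutchinson only on the residual. This is shown for intuition under stated assumptions (a random PSD test matrix, a one-third/one-third/one-third query split); it is not the paper's non-adaptive sketch.

```python
import numpy as np

def hutchinson(matvec, n, m, rng):
    """Hutchinson's estimator: average of g^T A g over m Rademacher vectors g."""
    total = 0.0
    for _ in range(m):
        g = rng.choice([-1.0, 1.0], size=n)  # random sign query vector
        total += g @ matvec(g)
    return total / m

def hutch_pp(A, m, rng):
    """Adaptive Hutch++ sketch: exact trace on a rank-k range sketch of A,
    plus Hutchinson on the projected-out residual. Uses m matvecs total."""
    n = A.shape[0]
    k = m // 3
    S = rng.standard_normal((n, k))
    Q, _ = np.linalg.qr(A @ S)           # orthonormal basis for a range sketch
    t_top = np.trace(Q.T @ A @ Q)        # exact trace of the captured part

    def resid_mv(v):
        # One matvec with (I - QQ^T) A (I - QQ^T)
        w = A @ (v - Q @ (Q.T @ v))
        return w - Q @ (Q.T @ w)

    return t_top + hutchinson(resid_mv, n, m - 2 * k, rng)

# Example on a random PSD matrix A = B B^T
rng = np.random.default_rng(0)
n = 200
B = rng.standard_normal((n, n))
A = B @ B.T
true_tr = np.trace(A)
est_h = hutchinson(lambda v: A @ v, n, 500, rng)
est_hpp = hutch_pp(A, 90, rng)
```

The variance reduction from handling the top eigenspace exactly is what lets Hutch++-style methods reach a given relative error with far fewer matrix-vector products than plain Hutchinson on PSD inputs.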


Near Optimal Sketching of Low-Rank Tensor Regression

Li, Xingguo, Haupt, Jarvis, Woodruff, David

Neural Information Processing Systems

This problem is motivated by the fact that the number of parameters in $\Theta$ is only $R \cdot \sum_{d=1}^{D} p_d$, which is significantly smaller than the $\prod_{d=1}^{D} p_d$ number of parameters in ordinary least squares regression. We consider the above CP decomposition model of tensors $\Theta$, as well as the Tucker decomposition. We obtain a significantly smaller dimension and sparsity in the randomized linear mapping $\Phi$ than is possible for ordinary least squares regression. Finally, we give a number of numerical simulations supporting our theory.
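The parameter-count comparison in the abstract is easy to make concrete. The sketch below computes $R \cdot \sum_{d=1}^{D} p_d$ versus $\prod_{d=1}^{D} p_d$ for illustrative mode sizes and rank (the values of `dims` and `R` are assumptions, not from the paper), and builds a rank-$R$ CP tensor from its factor matrices to show where the smaller count comes from.

```python
import numpy as np
from math import prod

def cp_param_count(R, dims):
    """Parameters in a rank-R CP model: one R x p_d factor matrix per mode."""
    return R * sum(dims)

def ols_param_count(dims):
    """Parameters in unstructured least squares over the full tensor."""
    return prod(dims)

dims = (30, 40, 50)  # hypothetical mode sizes p_1, ..., p_D
R = 5                # hypothetical CP rank

cp_n = cp_param_count(R, dims)   # 5 * (30 + 40 + 50) = 600
ols_n = ols_param_count(dims)    # 30 * 40 * 50 = 60000

# A rank-R CP tensor is fully determined by its D factor matrices:
rng = np.random.default_rng(1)
factors = [rng.standard_normal((R, p)) for p in dims]
Theta = np.einsum('ri,rj,rk->ijk', *factors)  # sum of R rank-1 outer products
```

Here the full tensor has 60,000 entries but only 600 free parameters, which is the gap a sketching matrix $\Phi$ tailored to the low-rank structure can exploit.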